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SOME  ISSUES  IN  PROGRAMING  MULT! -MINI -PROCESSORS* 


A.  Newel  I and  G.  Robertson 


INTRODUCTION 

Large  computer  systems  can  be  constructed  by  joining  together 
m^ny  minicomputers  — creating  what  can  be  called 
mu  I t i -m i n i -processors.  The  first  such  systems  are  just  reaching 
the  point  where  problems  of  programming  and  use  dominate  problems  of 
design  and  construction.  This  paper  attempts  to  share  some  of  our 
early  perceptions  about  what  these  problems  of  programming  and  use 
are.  It  also  allows  U9  to  capture  a historical  record  of  our  current 
v i ewpc i nt . 

We  are  not  the  architects  of  the  multiprocessors  we  will 
describe.  He  are  not  even  the  primary  systems  programmers,  who  create 
the  operating  system  and  operating  environment  within  which  the  user 
operates.  Ue  are  users  of  the  system.  But  we  are  not  arms- length 
users,  as  are  the  users  of  a typical  university  computation  center. 
For  to  use  such  a system  one  must  indeed  create  a special  programming 
system  on  it.  Thus  we  are,  shall  we  say,  systems  exploiters.  Ue 
are  just  coming  deeply  into  contact  with  our  multiprocessor.  Ue 
find  ourselves  facing  many  issues  of  how  to  exploit  the  system  and  to 
program  it  — of  how  to  make  it  yield  to  our  will. 

First  we  will  sketch  the  multiprocessors  that  we  are 
concerned  with.  There  are  only  two  of  them,  and  we,  the  authors,  are 
actually  working  on  only  one.  Uith  this  as  background,  we  will 
d i ecus  s seven  programming  issues. 

The  role  of  minicomputers  as  components  of  multiprocessor 
systems  is  quite  different  from  their  clrs9ical  role  as  laboratory 
computers.  Though  some  of  these  9even  issues  will  9eem  quite 
familiar  to  those  uh09e  world  is  the  on-line  laboratory  use  of 
computers,  some  of  them  wi  I seem  quite  foreign.  Hopefully, 
houever,  they  will  paint  an  interesting  picture  of  a use  of 
minicomputers  that  will  become  increasingly  common. 


* This  paper  was  given  as  an  invited  talk  at  the  1974  Conference  on 
the  On-Line  Use  of  Computers  in  Psychology. 


w 


MULTIPROCESSORS 


There  are  only  two  genuine  mu  1 1 i -mi ni -processors,  as  far  as 
ue  know,  though  there  may  oe  others  in  design.  A multiprocessor  is 
characterized  not  only  by  the  existence  of  many  processors,  but  by 
the  sharing  of  primary  memory,  i.e.,  the  processors  address  common 
memory.  This  3ets  them  apart  from  networks  of  computers,  which  have 
nu  ny  computers,  but  where  the  intercommunication  is  essentially  from 
secondary  memory  to  primary  memory,  i.e.,  each  computer  sees  all  the 
other  computers  as  peripheral  devices.  Multiprocessors  permit  a 
degree  of  computational  intimacy  not  available  with  networks. 


C.MMP:  THE  CMU  MULTI  -MINI  -PROCESSOR 

C.mmp  is  the  multiprocessor  at  the  Computer  Science 
Departmert  of  Carneg i e-Me I I on  University  lUulf,  1972].  As  shown  in 
Figure  1,  C.mmp  consists  of  16  PDP11  computers  connected  through  a 
crosspoint  switch  to  1G  primary-memory  ports.  Each  primary  memory  io 
2'T16  words,  for  a total  memory  of  a million  words.  Each  processor 
still  looks  like  a PDP11  with  a 1G  bit  word  and  an  address  space  of 
2T15  words  (actually,  2tl6  bytes).  In  fact,  a modest  modification 
mu9t  be  made  to  a processor  to  operate  within  the  system. 

Each  of  the  processors  can  lay  its  address  space  anywhere  in 
the  million  words  of  the  primary  memory.  It  does  so  through  an 
address  relocation  box  (Dmap  in  the  figure),  which  breaks  the  address 
space  of  the  processor  into  eight  4,096  word  pages.  Thus  the  system 
has  a small  number  of  large  pages,  each  of  which  may  be  independently 
relocated  through  Dmap. 

Each  Pc  has  its  own  Un'bus,  the  standard  bus  structure  of  the 
PDP11.  On  this  hang9  4K  of  local  memory  aB  well  as  all  the 
peripheral  gear  of  disks,  drums,  printers,  and  connections  to  the 
external  world.  The  last  includes  a connection  to  the  PDP10,  which 
is  the  large  general-purpose  time-shared  computer  system  in  the 
Computer  Science  Department.  As  the  figure  shows,  there  is  also  a 
large  (60-bit  1- microsecond)  clock  (K. clock),  which  provides  a common 
reference  frame,  and  an  interrupt  (K. interrupt)  which  connects  all 
processors.  There  is  at  the  moment  no  switching  between  secondary 
devices  and  the  various  processors.  A disk,  for  example,  is 
permanently  located  uith  one  Pc. 

Not  shown  in  the  ‘igures  is  the  ability  to  partition  Lhe 
9yster,i,  either  dynamically  or  statically  (manually),  so  that  it 
consists  of  independent  subsystems.  Thus  it  is  possible,  for 
example,  to  have  harduare  maintenance  going  on  at  the  same  time  that 
a user  system  is  operating  with  other  Pc’s  and  Mp’s. 

The  system,  though  made  up  of  minicomputers,  constitutes  a 
large  computer.  Taking  processors  to  be  1 1 /40s  yields  about  .3  to 
.4  million  instructions  per  second  (mips)  per  processor,  for  a total 
of  5 to  7 mips.  This  compares  approximately  with  an  IBM  360/158. 
There  must  be  some  contention  for  memory  as  the  number  of  processors 
increase,  but  this  is  not  expected  to  be  large  (the  Dmap’ s for  the 


1 1 /40s  contain  a cache). 


4. 


The  system  has  been  operational  in  parts  for  some  time.  The 
16x16  switch  has  been  running  since  March,  1974.  Ue  currently  have 
five  11/20  Pc’s  operational  with  500K  of  memory.  Ue  do  not  have  anu 
mod i f i ed  11 /40s. 


PLURIBUS  IMP:  THE  BBN  MULT  I -HI N! -PROCESSOR 

The  second  mu  1 1 i -mi ni -processor  system,  the  PLURIBUS  IMP 
[Heart.  L973)  , has  been  developed  by  Bolt,  Beranek  and  Newman  to 
serve  as  a high  speed  modular  HIP  (interface  message  processor)  for 
the  ARPA  computer  network.  Figure  2 shows  its  structure.  The 

processors  are  Lockheed  SUE  minicomputers,  which  are  16-bit  machines 
with  a 15-bit  word  address,  and  which  are  about  the  speed  of  the 
11/20.  They  have  a bus  structure  which  is  similar  to  the  DEC  Unibus. 
As  shown  in  the  figure,  two  processors  are  located  on  each  bus,  each 
with  4K  of  local  memory.  Thus  the  figure  illustrates  a 14-processor 
system,  the  maximum  sice  for  which  the  PLURIBUS  IMP  was  designed. 

(The  number  was  determined  by  the  app I . cat  i on,  not  by  hardware 

limits.)  The  primary  memories  come  in  8k'  units  with  two  units  on  each 

memory  bus.  The  switch  is  distibuted,  unlike  the  C.mmp  which  is  a 

monolithic  device.  Thus,  iinks  run  between  the  busts  of  the 
processors  and  the  buses  of  the  memories;  any  pattern  of  access  can 
be  obtained.  As  the  figure  shows,  there  are  also  I/O  buses  which  are 
inked  in  similar  fashion, 

An  initial  system  has  been  running  at  BBN  since  mid-74.  It 
has  operated  with  a range  of  conf  igurat  ions  (up  to  the  14  shown).  A 
basic  design  objective  was  to  create  a modular  series  of  IMPs,  which 
could  be  tailored  to  the  processing  load  of  the  network  node.  Though 
deliberately  designed  for  a specific  application,  the  hardware 
structure  is  quite  general.  It  poses  many  of  the  same  basic  issues 
we  face  on  C.mmp,  and  the  approaches  taken  on  the  PLURIBUS  IMP  offer 
interesting  contrast  points  with  those  taken  on  C.mmp. 


ISSUE  1:  HOU  TO  EXPLOIT  A MULTI -MINI -PROCESSOR 


The  first  issue  is  simply  hou  to  exploit  a multiprocessor, 
since  it  is  a large  systc:..  i r.  terms  of  power,  memory  and  bandwidth. 
It  has  special  structural  chat ac ter i st i cs , which  are  easy  enough  to 
statu,  but  n)t  so  easy  to  translate  into  performance  consequences. 

One  might  say  there  is  no  issue  — simply  use  the  machine. 
But  the  question  is  rot  laid  to  rest  so  easily.  Different 
strategies  of  exploitation  require  that  effort  be  spent  in  different 
ways,  thus  precluding  following  alternative  paths  with  any 
efficiency.  Indeed,  'he  issue  as  posed  makes  it  sound  like  the 
multiprocessor  arrived  sui  generis  with  the  question  of  use  fully 
open.  That  is  not  the  case.  The  exploitation  strategy  is  chosen 
before  the  design  even  begins  and  effects  many  of  the  structural 
features  of  the  syst  .1.  The  actual  situation  is  more  like  making  a 
movie.  Constructing  ihe  hardware  system  is  like  filming.  Using  it  is 
like  producing  the  movie  in  the  editing  room.  The  final  editor  is 
free  to  make  any  kind  of  movie  he  wants,  but  he  must  work  uith  the 
film  given  him  by  the  director. 

There  are  three  main  strategies  for  exploiting 
mu  I t i -m  1 n i -processor  s.  We  take  up  each  in  turn. 


PROGRAM  IT  FOR  A SPECIAL  TASK 

The  first  strategy  is  to  view  the  multiprocessor  as  a 
specialized  device  created  to  do  a specialized  task.  Hardware  and 
software  are  to  be  combined  optimally  to  perform  that  specialized 
task. 


This  in  essence  is  the  stategy  followed  by  the  BBN  group  in 
designing  the  PLURIBUS  IMP.  The  task  existed  ahead  of  time  in  a ue  I I 
defined  form  --  the  AP3A  Net  is  a functioning  system  with  a 
minicomputer  (the  Honeywell  51S  and  316)  as  IMP,  and  much 
experience,  both  statistical  and  qual  i t i at  i ve,  ha9  beer,  gained  uith 
the  requirements  for  an  IMP.  What  was  needed  uas  an  efficient  and 
highly  reliable  implementation  that  could  be  scaled  to  the  task.  All 
this  information  existed  prior  to  design  time,  and  the  9oftuare  and 
hardware  uare  designed  together  in  apparent  total  harmony. 

The  effects  of  this  can  be  illustrated  by  what  is  surely  a 
striking  feature  of  the  PLURIBUS  IMP  --  it  has  no  interrupt!  There  is 
no  way  in  uhich  an  arbitat  ily  occur  ing  external  signal  can  cause  the 
system  to  attend  to  another  task.  Since  the  interrupt  was 
introduced  in  the  late  Fifties,  it  has  been  considered  as  manadatory 
as  I/O  channels.  Abandoning  the  interrupt  is  an  important  design 
dec i s i on. 

The  underlying  rationale  is  very  simple.  The  algorithm 
to  be  programmed  uas  well  understood  and  existed  in  code  form  before 
the  hardware  design  begjn.  Of  tailed  analysis  of  the  program  revealed 
that  it  could  be  partitioned  into  segments  that  never  take  longer 
than  300  microseconds.  Since  the  responsiveness  of  the  system 


fits  the  grain  of  00  usee,  all  processes  can  run  to  completion 
without  interruption. 

Co-equal  with  the  short  program  segments  is  the  necessity  of 
getting  new  tasks  assigned  to  a processor.  If  this  takes  any 
appreciable  fraction  of  300  usee,  then  the  overhead  defeats  the 
scheme.  The  B8N  group  developed  a device  called  a PIO  (Pseudo 
Interrupt  Device).  This  hardware  device  holds  a set  of  numbers, 
corresponding  to  tasks,  which  have  been  given  it  at  arbitrary 
moments.  The  device  instantly  delivers  (and  deletes)  the  highest 
number,  corresponding  to  the  highest  priority  task. 

Interrupts  take  appreciable  time  (e.g.,  for  changing 
processing  contexts),  uhich  is  avoided  by  the  PLURIBUS  HIP,  along 
with  a fair  amount  of  operating  system  code.  In  fact,  the  system 
does  not  have  an  operating  systbm  in  any  general  sense  of  the  word. 
The  necessary  functions  are  distributed  carefully,  such  as  bu  the 

Din  3 * 


This  seems  highly  specialized.  Indeed,  that  is  the  point.  If 
vieued  as  a device  to  achieve  a narrow,  well-defined  total  task,  such 
specialization  is  possible.  Furthermore,  though  ue  know  of  no 
estimates  of  the  gains  made  to  the  PLURIB'JS  IMP  by  such 
specialisation,  ue  estimate  that  they  are  impressive. 

*ts  a final  footnote,  the  motive  of  specialization  does  not 
condemn  the  results  to  be  equally  specialized,  though  that  must  be 
the  fate  of  most  specializations.  But  the  PID  and  the  associated 
concept  of  presegment  in  the  code  into  run-to-complet ion  steps  may 
not  be  of  such  I i m • general  i ty  — though  it  does  pose  an 
interesting  compiler  problem. 


STANDARD  USER  ENVIRONMENT 

The  second  strategy  is  to  view  the  multiprocessor  as 
providing  a standard  user  environment,  much  as  any  other  computer 
does.  Thus,  when  completed  with  operating  system  and  user  facilities 
such  as  file  systems  and  language  processors,  the  system  will  look  no 
different  to  the  user  than  your  local  computation  center.  Only  down 
in  the  boiler  room,  so  to  speak,  will  (h“  multiprocessor  design 
become  apparent. 

Indeed  the  tuo  specialized  flavors  of  multiprocessors  that  do 
exjr.t  in  quantity  are  used  in  exactly  this  way.  One  is  the  use  of 
1 . 0 processors;  the  other  is  the  use  of  dual  central  processors.  In 
both  cases,  they  simply  handle  more  efficiently  the  total  set  of 
tasks  that  has  to  be  done  for  a general  user  shop.  An  interesting 
example  of  this  is  the  CDC  6600  which  has  a large  central  processor 
surrounded  by  ten  miniprocessors.  Uith  few  exceptions  that  ue  knou 
of,  it  shous  up  simply  looking  like  a very  pouerful  general  computing 
system. 


Uith  this  view  the  real  questions  are  the  economics  of 
multiple  smaller  processors  versus  the  single  larger  processor  for 


obtaining  a given  number  of  mips  per 
a i ways  the  question  in  computation,  but 
from  specific  applications. 


do  i ar . That  is  of  course 
here  no  specialization  comes 


nULTIPLE  SPECIALIZED  APPLICATION  SYSTEHS 


The  final  exploitation  strategy  »s  to  view  the  multiprocessor 
as  a system  in  which  a number  0f  specialized  application  systems  will 
be  realized,  both  simultaneously  and  over  time.  Each  of  the 
app  i cat  i on  systems  will  be  adapted  to  the  structure  of  a 
mu ! t i processer  in  order  to  take  as  much  advantage  of  it  as  possible. 

reflects 


Th.s  is  the  view  taken  with  C.mnp.  and  our  discussion 


essent ial ly 

C . mmp • 


the  considerations  that  have  arisen  with  respect 


to 


First  of  all.  this  strategy  leads  to  a general  operating 
system,  Sim  » several  app. icat ions  will  be  running  simultaneously. 

ven  if  we  envision  some  production  mode"  where  one  application 
might  dominate  the  system  for  a period  of  time,  throughout  most  of 
the  life  of  an  application  system,  one  is  coding,  debugging, 
modifying,  developing  and  exploring.  F0r  this  one  neither  needs  nor 
wants  the  entire  multiprocessor.  The  operating  system  of  C.mmp  is 
ca  ed  HYDRA  [wuif.  1 37<* ; ; ue  will  oescnce  some  of  its  features 
after  introducing  another  important  conS . ner a i on. 


If  small  address  spaces  (those  of  the  Pc’s)  are  to  move 
in  large  memory  spaces  (tnat  of  the  million-word  flpl,  then 
be  a memory  mapping.  D.  map  I see  Figure  1)  accomplishes 


around 
there  must 

this  for  C.mmp  and  BCP  (see  F gure  D)  ooes  so  for  the  PLURIBUS  IMP. 
The  important  design  question  .S  tne  nature  of  that  mapping.  An 
attempt  to  build  a general  us*>r  sys’em  ec<4s  to  making  that  mapping  a 
general  demand-paging  scheme.  Thus  all  addresses  go  through  a 
dynamic  process  of  discovering  whether  the  page  is  in  memory  and  if 
not  bringing  it  into  memory,  translating  the  processor  address  into 
the  physical  address  in  tne  memory.  Thus  has  evolved  the  general 
virtual  machine  concept  <n  mooern  computing. 

oiDiicC’mmP  d°ei>  n0t  hdVe  a cJeR,an0‘Pa9ing  scheme  (nor  does  the 
Pl-UHI BUS  IMP).  One  reason  arises  from  the  strategic  view  under 
discussion.  Paging  schemes  are  expensive  (in  time),  more  so  than 
simple  relocation  schemes.  Thus,  to  put  a paging  scheme  in  the 
hardware  is  to  agree  to  make  every  user  pay  this  cost.  At  this  point 
cne  has  eliminated  some  of  the  important  possibilities  fom 
adaptation.  For  not  only  the  cost  goes  up.  but  everyone  will  ue 
subject  to  the  same  paging  system  with  • x s particular  cost  profile, 
whether  it  fits  the  neeos  of  their  particular  system  or  not. 

Thus  what  we  find  in  C.mmp  i?  a so-called  "large  page" 
scheme,  in  which  there  are  only  eight  pages,  earh  large  enough  to 
hold  a substantial  subsystem  (4k  words).  Tne  use  of  this  page  system 
is  then  left  to  the  user.  There  is  a cost,  tor  insofar  as  paging  is 
necessary,  it  must  be  detected  and  executed  by  software.  But  in 
return,  we  obtain  the  possibility  of  fitting  the  paging  system  to  the 
application  system.  That  is,  insofar  as  pages  can  be  left  in  place  we 
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Figure  3:  HYDRA  System  Organization 
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pay  minimal  addressing  overhead. 
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igispisis 

shown  in  Figure  3,  TherP  ip  1 *s  overall  system  organization, 
bottom  part  of  the’fiqure  This  9 K®rne  ,t0  the  system.  shown  in  the 
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iTv.s::1^  , • Vrf ” rils  “ira 

Kernel  Multiprocessing  System  (KMPS  * “and  h lh'9  ,nd'’cated  b« 
manage  the  set  of  naaes  i n rL  rrpc  ’ • by  the  Proccsse9  that 

and  an  i nterpro^^  1/0  ^-em  (I/O), 

called  poiTcy-sys terns,'  ' co-ex?st!* ' “or3""9  ,SySte"s' 

associated  with  it  a nni  . . ” policy-system  has 

about  scheduling  and  paging9  ^d  I ^eTo,"^  SilH^s'"  ^ 

,er?inai  &Z-, 

po  I i cy-sustem  n h ° S“S'™'  he  maU  request  a Particular 

-----  if  be  :ailorcd  ••  «- 


°< « rnr:iw 

rtspeci,,cn?.sk,conjen«t.,he  "lu^syltsnj  £^1^" 

several  places  [Neue I 1 . 1373] . One  of'thcse'effoMs  Tl 

“e  are  PUrSUing  ' nbependent I y o.  any  interest  in  pul , Iprocessors 

version  for  “r  „3re  a 1 s°.  a>  ‘«»P*  '"9  to  construct  a »ul  1 1 processor 

R^e  I "hou;  The  aco„l,,0r  .»*"  inHia'  — °"  systia 

HEARSAY-2  (Lesser  1374]  Th^h  Stru^ture  of  thls  system,  called 
around  a global’  datt  ♦ ? Structure  of  the  system  revolves 

a controlling  process  that  watches  theactvity  of  the 


SEMANTIC 

PROCESSES 


n. 


INDUCTION  MODEL 

Data  Directed 
Information  Gathering 
Hypothesize  and  Test 
Parallel  and  Independent 
Deactivation  Simple 

Figure  4:  HEARSAY- 2 Conceptual  Structure 
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various  knowledge  sources  and  uses  the  collectively  weighted  guesses 
to  eventually  understand  an  utterance. 

The  HEARSAY  structure  lends  itself  to  a direct  decomposition 

^■♦Pa:a,,a'r  P:°"SSeS  t0  take  edvanta9e  of  a multiprocessor 
hl tecture.  Each  knowledge  source  can  be  a seperate  process  and  in 

many  cases  multiple  copies  of  a given  knowledge  source  can  be  used. 
Having  several  sources  of  knowledge  simultaneously  working  yields 
significant  improvements  in  the  time  it  takes  to  recognize  an 
utterance.  This  is  particularly  important  since  the  ultimate  goal  of 
such  systems  is  to  recognize  speech  in  real  time.  The  HEARSAY 
structure  thus  allows  for  effective  use  of  a closely  coupled 
multiprocessor  where  a large  common  data  base  can  be  easily  accessed 
by  a large  collection  of  processes. 

In  a large  system  like  HEARSAY,  it  maybe  necessary  for 
special  schedu I mg  algorithms  or  paging  strategies  to  be  employed. 

or  example,  the  initial  operating  system  built  on  top  of  HYDRA  does 
not  provide  for  priority  classes  in  scheduling.  The  large  collection 
of  processes  the  make  up  the  HEARSAY  system  may  very  well  need  to  be 

PvnoAetV°  0b!?’m  the  d^ired  effects.  The  important  point 
. •+uth!u  .R?.  d0es  allow  for  another  operating  system  to  co-exist 
HFARRAV  ♦ °ne  that  m°St  UBers  wiM  use-  Thus,  “e  expect 

HYDRA  t0  d6Ve  °P  and  USe  lts  own  °PRrating  system  built  on  top  of 


CONCLUSION  - STRATEGIES 

It  is  important  to  realize  that  of  these  choices,  no 
part.cu  ar  one  is  "right".  Each  is  an  attempt  to  maximize  the 

payoff  for  specific,  but  different,  goals.  The  first  choice,  that 
aken.  Py+the  PLURIRUS  IMP,  attempts  to  maximize  the  efficiency  and 
re  ia  i i y for  a specific  task.  With  the  second  choice,  that  of 
building  a general  computing  environment,  one  is  trying  to  find  the 
most  efficient  way  to  construct  a certain  environment.  If 

mul  t .processors  can  compete  for  that  environment,  they  can  be  an 
implementation  of  choice  and  mul  t i processers  should  be  designed  to 
meet  that  demand.  The  third  choice,  that  taken  by  C.mmp,  attempts  to 
e advantages  of  specialization  but  over  an  unknown  range  of 
application  systems.  It  must  necessarily  trade  off  some 

possibi I I ties  of  specialization  against  a system  that  can  handle 
several  such  applications  simultaneously.  Similarly,  it  must  trade 
off  the  best  scheme  for  general  computing  in  order  to  permit 
adaptations  to  occur. 

Nor  are  the  choices  mutually  exclusive  in  the  sense  that  if 
you  choose  X you  are  precluded  from  the  same  app  I '•  cat  i ons  that  choice 
r permits.  Speech  systems  will  be  brought  up  on  general  purpose 
systems.  (We  are  creating  a version  of  HEARSAY-2  on  our  PDP10.)  We 
will  be  creating  a general  user  environment  on  C.mmp,  which  will  run 
simultaneously  with  our  work  in  speech.  And  we  would  certainly  not  be 
surprised  to  see  the  PLURIBUS  IMP  used  for  other  applications  quite 
remote  from  the  message  processing  task. 


I 


The  choice  of  st 
a given  system.  It 
consistently  so  that  the 
some  dimensions. 


ISSUE  2.  HOW  TO  GET  ALL  THE  SOFTWARE 

'he  second  major  issue  is  how  to  obtain  all  the  software  that 
is  needed  for  such  a system.  By  now  we  are  all  aware  that  it  takes 
an  immense  amount  of  software  to  make  a computer  system  livable.  In 
practice  such  software  only  arises  with  the  development  of  a large 
and  active  user  community,  plus  the  continued  efforts  of  the 
manufacturer  over  several  years.  No  general  preaching  on  this  fact 
should  be  necessary  in  a minicomputer  user  community  where  new 

systems  arrive  from  the  manufacturer  rather  bare,  despite  advert izina 
claims. 


* 

r 


The  mul t i -m i n i -processor s are  composed  from  existing  minis 
(C.mmp  from  an  extensively  used  system,  the  P0P11;  PLURIBUS  IMP  from 
a new  machine,  the  Lockheed  SUE)  and  programs  exist  fo~  these  minis 
as  stand-alone  systems  (many  for  the  PDP11,  fewer  for  the  SUE).  Vet 
these  do  not  go  very  far  toward  satisfying  the  need.  First,  all  such 
systems  must  be  reconditioned  to  work  in  a multiprocessor 
To  do  this  in  a way  that  exploits  the  multiprocessing 
system-programming  problem.  But  further,  these 
are  big  systems  with  big  memories  and  they  can  use 
software  systems  commensurate  with  that  power.  All  this  adds  up  to  a 
major  problem. 


env i ronment . 
i s a genu i ne 
mu  I t i processor  s 


There  are  several  approaches  to  obtaining  the  software.  Even 
more  than  with  the  strategy  of  exploitation.  these  are  net  mutually 
exclusive.  In  fact  they  form  an  o.matorium  and  all  should  be  used 
(and  pretty  much  are  on  C.mmp). 


CODE  IT  IN  ASSEHBlY  LANGUAGE 


The  first  approach  is  (o  use  the  minimal  tools  provided  by 
the  manufacturer.  BBN  has  implemented  the  PLURIBUS  IMP  in  this  way. 
A simple  assembler  was  used  for  at'  programming.  A straight-forward 
loader  was  used  to  transfer  code  to  the  machine.  And  finally,  a 
relatively  simple  debugging  system  was  used.  The  debugging  system 
had  no  multiple-process  capabilities.  if  the  amount  of  software 
that  must  be  produced  is  small,  this  is  certainly  the  quickest 
appr ouch. 


CODE  IT  IN  A HIGH  LEVEL  LANGUAGE 

The  main  approach  used  by  the  implemented  of  HYDRA  for  C.mmp 
was  the  use  of  a high  level  language  --  BLISS  (Uulf,  1971).  BLISS  is 
an  ALGOL- 1 ike  system  implementation  language  which  is  available  for 
both  the  PDP10  and  the  PDP11.  It  has  an  optimizing  compiler  that 
produces  object  code  which  in  some  cases  is  better  (more  efficient) 
than  code  produced  by  a system  programmer  using  assembly  language. 
For  larger  software  systems,  it  is  desirable  to  use  a high  level 
language.  Such  a language  usually  allows  for  much  greater  programmer 
productivity  and  for  systems  that  are  much  easier  to  assimilate  and 
maintain.  in  conjunction  with  the  high  level  language,  one  generally 
finds  more  elaborate  relocating  loaders  and  debugging  packages.  The 


BLISS  debugging  package,  called  qryi?  ,, 

of  multiple  processes,  and  fopX1  3 °Wf  °r  syr,bo  1 ' c debu99 ' ng 
debugging.  source  language  routine  level 

only  is  the^per^ing^ysi'erl  runn i ng^bur^^h^  approach-  N°* 
which  show  the  productivity  of  the  HYDRA  S0,"e  measurements 

9ood,  both  in  terms  of  number  of  Li  ^ Software  team  to  be  very 
man-month  and  in  terms  of  the  number  o^SS^u^e^nu!"" ' ‘ °nS  P6P 

COUPLE  TO  LARGE  COMPLETE  MACHINE 

the  software  development  ^can*  be^ao^  ^ ^ mu  1 * * ~n ' n ' • much  of 

faci  I i ties  of  the  large  computer  r S'n9  ^ COnveni^t  user 
elaborate  linking  loaders,  a f i | e ^us^Th"^' cr°9s-comP:  'ers. 
a|l  be  implemented  and  used  more  easilu  on  "d  s ' mu 1 at 1 on  Packages  can 
m 1 n i • The  HYDRA  development  9 , arye  computer  than  on  a 

with  a PLP10  that  is  connected  tn  r ° 3 T°f  these  capabilities 
greatly  simplifies  tht  problem  nf  '"'w'  Thus  a large  C0,nPuter 
especially  true  in  the  o i . °*  software  development.  This  is 

under  development  and  ^ StiM 

some  of  the^oftMeri°c"nk'be  ^ladL^r™"'"!!  'ar£le  co"pu,er  that 

existing  systems  on  a"°!t'?'hrr  by  using 

advantage  of  the  file  system  on  thP  I ' instance,  one  can  take 

the  multi-mini.  I f the  mu  U i -mtn  T in  1 'eu  «>f  one  on 

could  be  transferred  between  the  t ' 3 ®°  haS  3 flle  system,  files 

large  computer  become  more  direct  I ^access 'h  I facMities  used  in  the 
another  example,  there  are  nn  • 9 CCassible  to  the  multi-mini.  As 
compiler  on  C.mmp.  Use  of  a BL  I 1 a ^ ® p|ans  to  create  a BLISS 

*2  - - SSttzui 

'0  .ha,  such  a project  „ completely  oT,™;  crUicat  so^We"^' 
GOOD-KERNEL  HYPOTHESIS 

the  kerIeTo,iS,heanpperat!„r,s,S,:d'  h“po,hesls  'ha'  it  one  bu, Ids 
operating  system  facilities  ^nd^aoo  I C°^eCt ' y>  the  h,gher  level 
easier  to  implement.  To  some  extent  ^YDRA * ' °n  proyrams  become  much 
hypothesis.  If  the  HYDRA  k t . Hi^DRA  ,s  the  first  test  for  this 

facilities,  building  opera  tL'st  35  ther,yht  Set  of  bas - 

simple  compared  to  buildina  the  6mS  °n  t0P  °f  ' * should  be  very 
machine.  Pu.ld.ng  the  same  operating  systems  on  the  bare 

The  implemer tat  ion  of  a fi|P  cucto. 

Phenomenon.  HYDRA ’s  Globnl  c1IB1h„i  t !,  y 15  an  example  of  this 
d i rectory  structur^  a , Tots  S t o C°n>'JnC"°"  W'th  a • ■ »P  ' e 

Simple  Objects.  In  order  to  build  a ?He  s,n,Per"°nant  St°ra!|e  °' 

!'*  P'W  needs  '»  fo'ine  a ne„  object,  cal  led'  " r,  ?e" “’VV"  ’ ‘ ’ °"e 
the  representation  of  the  file  rhD  a , f,le  ’ wh,ch  contains 

file.  The  directed  graph  structure  of  the 
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Figure  5:  L*  Facilites 
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^^*'!rNhrr4u’«l™,Werir^le,|  h!?;archica'  ,ile  structures. 
Machine  is  a very  t i me  consum^ngCpr"ect ! ' G ,ba  b-a 


interactive  symbolic  implementation  system 


providesAr'ota,aonerMLie  -°  6"  * "P t i on  system 

application  sys, eT"!;r^:;renu  “ 'hi"  “hiCh  °"e  ca"  ■>“'  'he 

such  a system,  called  L*  INeue  l‘  1371]  Ion  I"”  '‘*per'"en*  in9  uith 
s illustrates  the  kinds  of  faciliti!!  *f°+  sevsra  years.  Figure 
user.  The  system  has  a kernel  w ! tnat  3re  available  to  an  L,y 

initial  I anguaqes  which  are  ^ • II  also  provides  three 

facilities  (e"g  edUrrs  debuooere0  '*"■?'*'  3 ,ar9e  set  of  ^ 
features  of  Lv,  that  make  it  at  tract  i ie^re*'  ' ^ ’ aSSembler)<  The  maJ°r 

1}  L*  ia  3 t0la‘  environment  so  that  one  need  not  relu 
on  other  systems  to  produce  the  required  software 

°f  ,he  ,a^* ia  •«"« 

41  »Tanipu?It?„^,  in,erac,ive  a™*  puilt  around  a symbol 
manipulation  language.  y 

°^ho:no:guhhoneheiia"uVr:;no9  as  *? a iika  ">*«e«i..  i», 

programming  features.  9 impl.msntmg  extremely  ba  sic  systems 


197A  IRobsr^son^WAl’.  'Tslall  "[l"  L'\°"  C-"»p  during  Apr  i I . 

betusen  ons  and  three!  mas  given  the  task  of h9"!,™0”  Ivaru'm9 
pre-processor  that  would  tai,0  ; * . ask  of  bu,ldln9  a speech 

analyzer,  sample  the  data  oplTt  t°m  3 real_time  audio  spectrum 
segments;  the  result  to  he  ? ' * .'nt0  phoneme5*  and  label  the 

display  processor^ee  Figure  T Ve  Z ° " 3 9raphi- 

graphically  illustrate  the  rmp.h'  ! demonstration  was  intended  to 
to  the  multiprocessor  :gM:«:P  ed'UP  "°re  “*'»  added 


run  the  L.v  sys tem'unde^HYORA ^but "°n,h'  Ue  ini,ia"y  “anted  to 
not  yet  ready  for  md  userf  u T*  U°ab'B  ,0  because  HYDf,A  “as 
tor  C.mmp  ant  „"h“n  ,£?. C°ns  ruc,'d  a stand-alone  L.  system 

designed  an  interface  belt..!!  Bliss’3"911  ?B"°ns,a,io'’’  Ue 

segmenter,  and  labeler  could  be  “r  i tten^  i BL ISS  2 ?*  ampler, 

specialized  mu  1 1 i orocess  inn  wryten  in  BLISS.  Ue  also  built  a 

character  i s t i c that  it  w-ie  n f ^ uhlcb  bad  the  unusual 

end  result  of  the  experiment  ^ ° mu 1 * 1 pr°9ramm ' n9  system.  The 

pre-procecsor.  The  important  point  iHhat  fta'*  mU'tip;°Cess  speech 
was  produced  in  a short  Deriod  nf  V*  ' 96  amount  of  software 

advantages  o,  a good  i I i^mp^r  ^ 


1 

J 


CONCLUSION  - SOFTWARE  PRODUCTION 


All  the  various  approaches  to  producing  requisite  software  on 
a mu)  t.-mm, -process^  are  important.  For  each  approach  there  is  a 
set  of  software  tasks  where  it  is  better  than  any  of  the  other 
approaches.  It  is  thus  important  that  the  multi-mini  environment  be 
ich  enough  to  allow  the  use  of  any  or  all  of  these  approaches. 


g 


ISSUE  3:  SflALI. -ADDRESS  PROBLEM 


20. 


memory  in  a multi-mini  with  the  sma  I h ' 96  °f  the  'arge 

mini  • On  C.mmp,  ft-  instance  bJk  ' P " 'C3'  address  space  of  a 
only  32 K words  while  the  primaru  memnr  'r°CeSSOr  :an  d,rect'y  address 
^1  I ion  words.  Anuone  who  , T U C3P3C’ty  of  thp  Astern  is  one 

is  forced  to  make  serious  design  ^er  pr°gfflm9  ‘arger  than  32K 
effort.  Furthermore,  since  thG9  Ia-  °ns  about  h's  programming 

^Plications  will  be  atJracted  o the^mu  I ”'emo?y  *•  available.  large" 
occur  frequently.  A larne  numhp  h « """n'  3nd  the  problem  wi|1 

and  attemoted.  For  small  or  we  II  ha;e  been  suggested 

solutions  that  work  well.  However  L5'  6?5,  there  are 

programming  system,  all  the  sol„tlnn«  3 large*  dWnamic 

either  restrictive  or  cosily  "ov^d  ’ '°  ”°  ,ar  «"*  «•  he 

STATIC  OVERLAYS 

overlay  pages  us  i nq^e^ocat  ■ S°  * Ut ' °n  t0  this  problem  has  been  to 

the  latest  version  of  L*  bo^us^th!  P Tr  °"  HVDRA  and 

always  be  in  the  same  nlare  . solution.  The  overlays  must 

overlay,  contain  ”;ydaP;rj“'uro:d0reSS  ,ranSlS,i°"’  ><  the 

translation  problem.  However  ? Ms’  'here  I,  no  andress 

data  which  contains  .any  addr^s  "eerXel  ZZZ' " "9  SUS,e“5  ha*a 
overlayed  statically  „r  i„cUr  address  tr^sf,?  "ust  ei,her  bo 

rase.  code  payee  always  face  the  olob,  ™ '?*“•  ,n  an« 

“°rk  0ar  t i cular  I y for  small  or  weM-behaved'syl^s . " ’° 

systems  where  code'm  an'overlau  in'"’'  S°"'e  larrJe  Programming 

another  overlay  which  must  rllh, attempts  to  access  data  or  code  in 
Static  overlays  proven,  bil'  l6  ha  s“a  physical  addresses, 
problem  with  this  solution  Is  th.  k9  C°rreC"U-  Tha 
relocation  registers.  Laroe  q,  ovt "head  incurred  in  changing 

relocation  registers  frequentlu  thu”3  l,kely  to  want  to  change 

frequent  I y,  thus  making  the  overhead  critical. 

POSITION  INDEPENDENT  OVERLAYS 

Another  solution  similar  to  thp  tr- 
over lays  that  contain  oositinn  inn  * ,th  previous  or>e  is  to  use 
you  from  the  restriction  l ? ,nc,ePencJent  code  and  data.  This  frees 

Physical  location.  The' ^pr  fmaru'brobl  ° ' 09  I*'  mt°  the  San,e 

Position  independent  code  and  oata  ten  to“  ak  t””'1'1'0"  ' 5 ,hat 
more  slowly,  and  be  harder  to  produce  The  *?kf-  space*  execute 
underlying  processor  in  the  muMUmlni  aNn  sa  a"°"  assumes  that  the 
Pos 1 1 i on  independent  code  (as  does  the  mu’.  ' l0lJ  cost 


SOFTWARE  DEMAND-PAGING 


Ano  t her 

demand-pag i ng. 


kind  of  solution  is  to  provide  software 
mvolves  accessing  data  with  "fat  addresses". 


e.g.,  32  bits  of  effective  address  that  determine  a page  and  an 
offset  within  that  page.  It  also  involves  making  any  code  that 
crosses  page  boundaries  go  through  both  demand-paging  and  address 
translation.  Ue  tried  this  solution  in  an  early  version  of  LiV.  This 
solution  takes  more  space  and  has  some  position  independent  code 
which  is  harder  to  produce.  It  also  suffers  from  severe  overheads  for 
the  demand-paging  and  address  translation.  Its  great  virtue  is  that 
it  provides  a true  I y general  system;  one  where  the  entire  large 
memory  is  directly  addressable. 


niXED  SOLUTIONS 

The  most  promising  approach  appears  to  involve  mixing  several 
of  the  previous  solutions.  There  are  two  basic  ways  to  do  this.  One 
method  would  start  with  a software  demand-paging  system  (fat 
addresses)  and  allow  for  some  pages  to  be  specialized  (i.e.,  have 
small  addresses  for  efficiency).  The  other  method  would  start  with  a 
static  overlay  structure  and  build  up  mechanisms  to  allow  for 
arbitrary  virtual  addresses  (pseudo-fat  addresses) . Ue  are  currently 
working  with  the  latter  of  these  methods.  Ue  have  constructed  the 
latest  L*  system  with  a simple  static  overlay  stricture.  Ue  are  now 
in  the  process  of  designing  the  mechanisms  necetsary  for  certain 
facilities  (e.g.,  compiler)  to  be  able  to  access  arbitrary  virtual 
addresses.  The  main  problem  now  is  to  determine  whether  a large 
application  system  like  HEARSAY-2  can  be  decomposed  into  small 
overlays  that  don’t  need  direct  access  to  each  other  except  on  rare 
occasions.  It  remains  to  be  seen  whether  this  solution  is  adequate 
for  large  systems. 


ISSUE  4:  PROTECTION 


The  forth  major  issue  is  protection.  The  style  and  amount  of 
protection  in  a multi  mini  operating  system  depends  greatly  on  which 
strategy  is  chosen  for  exploiting  the  multi-mini. 


NO  PROTECT i ON 


The  PLURIB'JS  IflP  represents  one  end  of  the  spectrum.  The 
decision  to  build  a special' zed  system  for  a specialized  mult  -mini 
led  BBN  to  provide  r,o  protection  at  all.  The  assumption  here  is 
that  there  is  only  one  application  system  running  and  it  was  built  by 
a small,  closely-knit  programming  group.  Every  module  in  the  system 
must  be  aware  of  other  modules  and  their  conventions  so  that  they 
don’t  get  in  each  others  way.  There  is  also  ar,  assumption  that  the 
system  is  small  enough  to  be  easily  debugged.  A module  with  a bug 
coild  accidentally  destroy  another  module.  If  there  are  too  many 
modules  (or  they  are  too  complex),  it  may  be  very  difficult  to  find 
the  incorrect  module.  This  solution  generally  works  well  onlu  for 
small  and  well  understood  applications. 


AUTHORITY-BASED  PROTECTION 


At  the  other  end  of  the  spectrum,  the  decision  to  allow  many 
(unknown)  applications  to  run,  with  program  development  occurring 
simultaneously,  leads  to  great  concern  with  protection,  both  net ^een 
users  and  between  various  processes  being  run  by  one  user. 
Protection  can  be  viewed  as  a central  issue  of  operating  systems; 
i.e.,  the  control  of  resources,  the  distribution  of  the  rights  to  use 
these  resources  to  various  processes  on  a nomen t- to-momen t basis,  and 
the  guaranteeing  of  these  rights.  Host  of  the  first  and  second 
generation  operating  systems,  such  as  the  existing  DEC  systems 
(TQPSTEN  and  TENEX)  and  the  IBM  OS3G0,  ore  so-called  authority-based 
systems.  In  these  systems,  protection  is  associated  with  the  data 
and  not  with  the  processes  accessing  the  data.  This  tends  toward 
crude  categorization  of  protection  (e.g.,  the  familiar 
r ead/wr i te/read-wr i te  distinction).  There  are  currently  no 
multi-mini  systems  that  use  authority-based  protection,  although  it 
is  clearly  the  alternative  that  would  have  been  used  a few  years  ago. 


CAPABILITY-BASED  PROTECTION 

HYDRA  is  a capability-based  system,  which  means  that  it 
associates  rights  to  use  resources  with  individual  processes  and  not 
with  individual  data.  Capability  systems  permit  much  finer 
distribution  of  rights,  essentially  on  arbitrary  processes.  Lie  are 
only  just  beginning  to  see  capab i I i ty-baj ed  operating  systems,  and 
this  aspect  of  HYDRA  represents  an  independent  research  effort. 

HYDRA  has  an  abstract  view  of  the  entities  that  can  be 
protected  and  the  rights  to  manipulate  these  entities.  Because  of 
this*  it  is  possible  to  build  higher  level  protection  into 
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specialized  subsystems, 
that  reflects  the  basic 
application  systems. 


This  is  another  aspect  of  the  HYDRA  design 
exploitation  strategy  of  multiple  specialized 


There  is  a subtle  disadvantage  with  capability-based  systems 
that  ue  are  learning  the  hard  way.  You  general  ly  must  do  much 
planning  in  a session  to  insure  that  you  ui  i I I have  all  of  the 
capabilities  you  need.  If  your  program  has  a strange  bug  and  you 
den  t have  the  proper  'ights  or  capabilities,  you  may  not  be  able  to 
explore  the  bug.  At  this  point,  we  lack  the  experience  with  HYDRA  to 
kne  ahetner  the  advantages  of  such  a protection  system  outweigh  the 
d i sadvantages  or  not. 


Generally,  there  is  a computing  cost  associated  with 
protection  and  the  more  protection,  the  higher  the  cost.  This  leads 
the  user  of  an  over-protected  system  to  find  ways  of  avoiding  the 
protection  mechanisms.  However,  with  an  under-protected  system,  the 
user  tends  to  lose  much  work  when  something  that  belongs  to  him  is 
destroyed  by  someone  else.  He  also  tends  to  lose  time  trying  to 
d°bug  complex  systems  when  the  various  parts  of  the  systems  are  not 
protected  from  each  other.  Finding  the  correct  balance  of  protection 
is  both  important  and  difficult,  and  we  expect  this  issue  to  become 
more  visible  as  ^.mmp  gets  more  and  more  ust. 


ISSUE  5:  TINE-CONSTANT  PROBLEM 


The  fifth  major  issue  has  to  do  with  the  time  constants  for 
basic  fur~cions  that  must  be  performed  on  any  mul  t i -m i n i -processor . 
This  is  actually  a class  of  problems,  one  for  each  application  system 
against  the  pattern  of  bas  c time  constants.  For  this  reason  we 
cannot  enumerate  general  alternatives,  but  must  select  illustrative 
issues  that  arise  for  particular  examples.  The  important  point  is 
that  the  time  constants  have  an  immense  influence  on  programming 
style  and  system  design. 

Consider  two  basic  ways  of  building  a large  programming 
system:  1)  have  one  process  that  has  many  overlays  and  does  a great 
deal  of  relocation  register  changing;  or  2)  have  many  sma  I I processes 
that  communicate  with  inter-process  communication  and  don’t  ever 
change  relocation  registers.  In  HYDRA,  relocation  register  changes 
are  about  an  order  of  magnitude  faster  than  inter-process 
communication,  so  the  correct  choice  is  the  first  way.  Functionally, 
many  intercommunicating  processes  may  be  the  preferred  way  to 
organize  the  system,  but  the  time  constants  preclude  it.  The  time 
constants  may  have  more  impact  on  design  decisions  than  the 
functional  characteristics  of  the  operating  system. 

Another  example  of  this  problem  has  to  do  with  prevention  of 
deadlocks,  a pervasive  problem  in  all  multiprocessor  systems.  The 
HEARSAY  system  wishes  to  have  a large  data  base  shared  between  many 
processes.  In  order  to  prevent  a process  from  having  the  data  it  is 
working  with  changed  by  another  process,  semaphores  are  used  to  build 
lucking  structures  around  relatively  small  pieces  of  code. 
(Semaphores  are  a standard  device  to  avoid  interference  between 
processes;  they  are  flags  that  indicate  whether  an  object,  e.g.,  a 
piece  of  code  or  a piece  of  data,  is  in  use  by  another  process.)  The 
problem  is  that  the  operations  on  semaphores  that  HYD  M provides  are 
much  too  slow  relative  to  the  frequency  of  use  and  size  of  code  they 
are  locking.  Because  of  the  time  constant,  we  had  to  build  another 
level  of  semaphore  that  would  only  make  use  of  the  HYDRA  semaphore  on 
rare  occasions.  This  is  an  example  of  a f unc t i ona I capab i I i t y that 
could  not  be  used  because  of  the  time  constant  problem. 

Another  place  where  the  time  constants  become  critical  is  in 
real-t  me  application?.  The  basic  functions  like  context  swap, 
relocation  register  change,  inter-process  communication,  and 
interrupt  handlinj  take  much  more  real  time  on  a mini  than  on  a large 
computer.  The  difference  can  be  attributed  *o  the  differences  in  the 
raw  processing  power  and  in  the  complexity  of  instruction  sets.  When 
these  di  f ferences  are  taken  into  account,  the  relative  overhead  on  a 
mini  and  a large  computer  are  about  the  same.  However,  the  real-time 
overhea  ■'  becomes  critical  when  doing  real-time  computation,  or  when 
minimizing  terminal  response  time. 

The  time-constant  problem  thur  conies  down  to  understanding 
the  time  cons  _ints  and  their  relationships  with  respect  to  a given 
application  system.  This  understanding  is  necessary  if  the 
application  system  is  to  make  effective  use  of  the  multi-mini.  The 
problem  exists  in  all  computer  systems,  of  course.  But  it  is  much 


more  critical  in  systems  with  highly  flexible  and  generai  operating 
systems  (such  as  HYDRA).  Such  operating  systems  provide  functional 
capaoi  I i ties  of  great  power  and  elegance,  but  the  time  constants 
often  deny  their  use.  The  situation  is  especially  critical  in 
mul  t (processors  where  exploitation  of  the  system  requires  working 
with  many  processors  in  some  coordinated  scheme.  This  can  only  be 
done  by  working  through  the  operating  system.  It  is  almost  impossible 
in  a multiprocessor  to  avoid  the  time-constant  problem  by  withdrawing 
to  your  own  world  to  avoid  interaction  with  the  operating  system. 


ISSUE  81  RELIABILITY 


The  sixth  major  issue  is  total  system  reliability. 
Mu  I t i -m  i n i -processors  are  complex,  and  much  can  go  wrong  in  both 
hardware  and  software.  Also,  the  hardware  that  provides  for 
multiprocessing  provides  redundency,  which  if  exoloited  can  permit 
more  flexibility  in  recovering  from  hardware  failures.  Because  of 
these  factors,  reliability  plays  an  important  role  in  all 
multi-minis.  There  are  several  known  approaches  to  the 
reliability  prob I e:n. 


CRASH  AND  DUMP 

The  most  common  approach  in  existing  large  computer  operating 
systems  is  to  bring  the  system  to  a grinding  halt  when  a failure  is 
detected.  The  system  is  usually  dumped  at  this  point  so  that  system 
maintainers  can  try  to  determine  what  caused  the  failure.  Then  a 
fresh  copy  of  the  system  is  brought  up.  The  obvious  flaw  with  this 
strategy  is  that  all  users  lose  their  current  run,  even  if  the 
failure  would  not  have  otherwise  affected  them.  This  approach  is 
slowly  disappearing  as  more  experience  is  gained  with  smooth  recovery 
from  f a i I ures. 


AMPUTATION  AND  EXTERNAL  BACKUP 

The  PLURIBUS  IMP  stresses  reliability  as  its  most  important 
attribute.  Their  system  is  highly  modular  and  redundant.  Every 
structure  in  hardware  and  software  is  isolated  and  dup I i cated.  The 
system  makes  periodic  validity  checks  and  amputates  any  structure 
that  appears  suspicious.  If  the  amputation  causes  some  data  to  be 
lost,  an  external  backup  provides  the  data  to  be  dealt  with  again. 
The  interact  ion  between  an  IMP  and  the  ARPA  Net  involves  much 
handshaking,  When  data  is  accepted  by  an  IMP,  it  acknowledges  the 
reception.  If  no  ockr  wiedgement  is  received  within  a certain  time 
frame,  the  data  is  sent  or  ;n.  In  this  way,  data  is  distributed 
across  al  I IMPs  in  the  network.  Thus,  the  specialized  nature  of  the 
application,  in  this  case  the  ARPA  Net,  provides  an  external  backup 
for  lost  data,  no  matter  what  the  cause  for  the  loss.  This  permits 
good  solutions  to  the  local  reliability  problem. 

The  reliability  of  the  PLURIBUS  IMP  is  so  high  that  the 
first  time  the  system  was  ever  brought  up  they  discovered  that  the 
only  way  to  stop  it  was  to  pull  the  plug  on  the  whole  system.  Since 
then,  their  system  has  grown  to  be  one  of  the  most  reliable  known  to 
us . 


RECOVERY  BY  RECONSTRUCTION 

The  nature  of  reliability  on  C.mmp  and  HYDRA  is  quite 
different  but  still  very  important.  The  stress  in  HYDRA  is  on 
recovery  after  a failure  has  been  detected.  C.mmp  does  not  have  the 
kind  of  backup  that  PLURIBUS  IMP  has  with  the  ARPA  Net.  The  method  in 
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in  1 in  ° " m am  3 9 oba  symbo I table  (GST)  which  contains 
1 °°  ab0^  p/epy  structure  in  the  system.  The  GST  is 
aintamed  so  that  any  destroyed  structured  can  be  recreated 
including  parts  of  the  GST.  To  detect  f a i I ures.  .he  hardware  has 

cherk  m0  Ied,f°  d°  PaHty  checkin9  and  the  software  maintains 
checksums  of  all  critical  structures.  In  addition,  whenever  an  error 

Assert  hlru  ? Z runn ' n9  a user  system,  the  error  information  is 
pcissed  back  to  that  system.  Thus,  the  end  user  can  build  reliable 

'r  :°n  sy^ems  HYDRA’S  reliability  is  still  under  research  and 
ts  success  has  not  been  fully  determined. 


PARTITIONED  SYSTEMS 


par  t i t i 
concurr 
har dwar 
par  t i t i 
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Another  aspect  of  re  I i ab i I i ty  inC.mmp  is  the  ability  to 
on  the  system  into  several  smaller  systems.  This  allows 
ent  system  development.  general  user  facility  use.  and 


e mamtance  and  development.  The  PLURIBUS  IMP  can  also  be 
oned  but  only  by  recabling.  The  C. mmp  par t i t i oni ng  i« 
ed  by  switches  and  is  changed  on  very  short  notice  (a  couple 
tes).  This  ability  is  being  used  on  a day-to-day  basis  to  aid 
ecting  a stable  configuration  of  C.mmp.  Ue  also  use  it 
tly  to  a I low  several  groups  to  work  independently. 


ISSUE  7:  PERFORMANCE  EVALUATION 


The  last  major  issue  is  how  to  analyze  and  evaluate  the 
performance  of  running  systems  on  a mul t i -m ini -processor . This  issue 
is  perhaps  the  least  understood  of  all  of  the  issues.  Programmers  are 
notoriously  wrong  in  guessing  what  their  programs  are  actually  doing 
and  where  the  time  is  really  going.  There  is  reason  to  believe  that 
on  a multi-mini,  the  problem  is  going  to  be  much,  much  worse.  The 
decomposition  of  algorithms  to  take  advantage  of  parallel  processing 
is  currently  a rich  research  field.  Imagine  how  difficult  it  will  be 
to  determine  the  dynamic  characteristics  of  several  cooperating 
para  I lei  processes. 

I rad  i t i ona  I I y,  the  analysis  of  performance  of  a computer 
system  or  a program  is  undertaken  as  a study.  Often  this  study  is 
primarily  of  academic  irterest,  though  sometimes  with  a view  to 
balancing  the  computer  system  or  making  the  algorithms  run  more 
efficiently.  However,  we  believe  that  for  multiprocessors  there  will 
be  a major  shift  in  emphasis  of  performance  evaluation  from  analysis 
tools  to  operational  tools.  They  will  become  as  important  to  a 
multiprocessor  user  as  the  traditional  debugging  tools. 

The  solution  to  the  problem  on  C.mmp  has  been  to  start  a 
research  project  on  a hardware  device,  called  a hardware  monitor 
[Fuller,  1973],  which  will  allow  us  to  measure  specified  kinds  of 
activity  on  one  processor’s  bus.  This  device,  used  in  close 
conjunction  with  software  in  HYDRA,  should  give  the  user  a chance  of 
obtaining  the  dynamic  job  statistics  he  needs  to  analyze  the 
performance  of  his  programs.  He  also  hope  to  use  the  device  to  help 
understand  the  real  performance  characteristics  of  HYDRA  in  order  to 
improve  system  performance. 

An  example  within  HYDRA  illustrates  the  use  of  the  hardware 
monitor.  Ue  have  a real-time  device  that  connects  C.mmp  to  the  PDP10. 
Ue  recently  discovered  that  characters  we~e  occasionally  being  lost, 
presumably  because  HYDRA  was  running  blind  to  interrupts  for  too 
long.  Ue  were  able  to  verify  this  with  an  oscilloscope.  However, 
we  have  not  been  able  to  find  tbr  code  in  HYDRA  responsible  for  the 
excess  blind  time.  Ue  expect  rhe  hardware  monitor  to  be  able  to 
isolate  the  offending  code.  The  important  point  is  that  a multi-mini 
is  so  complex  that  new  techniques  must  be  developed  to  aid  in 
performance  evaluation. 


CONCLUSION 


Though  other  programming  issues  could  be  discussed,  seven  is 
al  I anyone  should  be  called  upon  to  remember.  Let  us  sum  up. 

Ue  believe  that  mu  1 1 i -m i n i -processors  such  as  C.mmp  and 
PLURi3US  HIP  will  come  to  provide  a substantial  amount  of 
computational  power.  Although  the  technical  capability  for 
creating  multiprocessors  has  existed  for  quite  awhile,  only  with  the 
development  of  the  minicomputer  (and  now  the  microprocessor)  has  the 
cost-benefit  structure  pointed  to  mu  I t i processors  as  an  important 
technical  solution.  As  a result  we  know  almost  nothing  a*  this 
point  about  the  actual  programming  and  use  of  genuine 
multiprocessors,  i.e.,  those  where  the  multiprocessor  st'ucture  is 
sufficiently  general  and  available  to  affect  the  structure  of 
application  systems. 

Several  of  the  : ssues  we  have  discussed  in  our  list,  e.g., 
how  to  get  all  the  software,  protection,  reliability,  and  performance 
analysis,  are  well  recognized  problems  and  are  subject  to  intensive 
independent  research.  The  work  with  multiprocessors  gives  them  a new 
twist,  however,  raising  to  consciousness  aspects  that  are  of  little 
interest  ir  other  kinds  of  systems.  Though  still  speculation  on  our 
part,  performance  analysis  as  a real-time  dynamic  debugging  tool 
represents  a new  world. 

Two  items  on  our  list,  the  sma  I I -address  prcolem  and  the 
time-constant  problem,  do  not  represent  areas  that  are  well  explored 
in  computer  science.  Ue  have  seen  no  solutions  to  the  underlying 
programming  issues  in  the  literature.  Both  items  seem  critical  and 
worthy  of  considerable  attention. 

The  sma ! I -address  problem  seems  inherent  in  mu  I i tprocessor s 
built  with  mini-  or  micro-computers.  Possibly  the  problem  will  be 
solved  by  avoiding  it.  Some  new  minis  are  appearing  on  the  market  now 
with  large  physical  address  spaces  but  maintaining  the  other 
attributes  of  a mini.  However,  a large  £,  ess  requires  many  bits, 
both  in  memory  to  retain  it  and  in  bandwh  n to  communicate  it. 

Ue  might  point  out  to  psychologists  that  the  problem  is  in 
essence  faced  by  a population  of  intercommunicating  humans.  No  one 
has  internal  symbols  (i.e.,  addresses)  designating  all  the  things 
that  all  individuals  designate  internally.  That  is.  they  do  without 
large  addresses  in  the  hardware.  Instead  they  use  language,  which 
is  a set  of  software-maintained  large  addresses,  for  their 
intercommunication.  They  continue  to  think  their  private  thoughts  in 
separate  representational  worlds.  Thus  the  problem  of  communicating 
with  small  addresses  is  a fundamental  one  not  restricted  to  the  world 
of  mu  1 1 1 -m i n i -processor s. 

The  time-constant  problem  seems  critical  if  we  are  to  make 
effective  use  of  multiprocessor  architectures.  We  must 
understand  what  various  patterns  of  relative  and  absolute  time 
constants  imply  for  the  processing  systems  built  on  top  of  them.  Only 
then  can  we  design  multiprocessors  with  a balance  between  their 
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with'wMch  ?uPabiMtieS  and  the  dynamic  capabilities  (i.e.,  the  speed 
1 k !V3rry  °Ut  V3ri0US  functions)-  What  can  be  done  within 
overheads  sn-h33  a'ready.been  desi9ned  is  still  quite  unclear.  Some 
OthPr!' 1 ! swapp i ng  t.»e.  are  built  into  the  hardware. 

between  flexibt?  Protectlon-  may  be  subject  to  ingenious  trade-offs 
between  flex.bihty  ana  computing  cost  (for  checking  protection).  For 

so  Thit’a°IIITn?2n  th;nkh°V°mpiling  °Ut  restricted  Protection  schemes 
so  that  a mini  mum  of  checking  has  to  be  done  dynamically. 

up  US  haV!  attemPted  t0  expose  a set  of  programming  issues  that 

we  have  encountered  m beginning  to  use  a mu  1 1 i -m i n i -processor . Ue 

sau  noth°Ur  U?Pamenta'  '9norance  of  the  correct  formulations  - to 
say  nothing  of  the  correct  solutions  - for  most  of  these  problems  in 

this  new  environment.  Perhaps  these  will  no  longer  look  like  the 

iTno^P  P"°b!ems  after  we  obtain  more  experience.  That  experience 
is  now  enveloping  ue  day  after  day. 
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